Efficient Bayesian Clustering for Reinforcement Learning

Authors

  • Travis Mandel
  • Yun-En Liu
  • Emma Brunskill
  • Zoran Popovic
Abstract

A fundamental artificial intelligence challenge is how to design agents that intelligently trade off exploration and exploitation while quickly learning about an unknown environment. However, in order to learn quickly, we must somehow generalize experience across states. One promising approach is to use Bayesian methods to simultaneously cluster dynamics and control exploration; unfortunately, these methods tend to require computationally intensive MCMC approximation techniques which lack guarantees. We propose Thompson Clustering for Reinforcement Learning (TCRL), a family of Bayesian clustering algorithms for reinforcement learning that leverage structure in the state space to remain computationally efficient while controlling both exploration and generalization. TCRL-Theoretic achieves near-optimal Bayesian regret bounds while consistently improving over a standard Bayesian exploration approach. TCRL-Relaxed is guaranteed to converge to acting optimally, and empirically outperforms state-of-the-art Bayesian clustering algorithms across a variety of simulated domains, even in cases where no states are similar.
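The "standard Bayesian exploration approach" the abstract compares against is posterior-sampling (Thompson sampling) reinforcement learning: maintain a Dirichlet posterior over each state-action's transition distribution, sample one complete model, and act greedily with respect to it. The sketch below illustrates that baseline only, not the paper's TCRL clustering; all function names (`sample_mdp`, `value_iteration`) and the toy setup are illustrative assumptions, not from the paper.

```python
import numpy as np

def sample_mdp(counts, alpha=1.0):
    """Draw one transition model from the Dirichlet posterior of each (s, a).

    counts[s, a, s'] holds observed transition counts; alpha is a symmetric
    Dirichlet prior pseudo-count.
    """
    n_states, n_actions, _ = counts.shape
    P = np.zeros_like(counts, dtype=float)
    for s in range(n_states):
        for a in range(n_actions):
            P[s, a] = np.random.dirichlet(counts[s, a] + alpha)
    return P

def value_iteration(P, R, gamma=0.95, tol=1e-6):
    """Return the greedy policy for the sampled model (P, R)."""
    n_states, n_actions, _ = P.shape
    V = np.zeros(n_states)
    while True:
        Q = R + gamma * (P @ V)        # shape (n_states, n_actions)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return Q.argmax(axis=1)
        V = V_new

# Toy usage: 3 states, 2 actions; state 2 is rewarding, and action 1 from
# state 0 has mostly been observed to reach state 2.
np.random.seed(0)
counts = np.zeros((3, 2, 3))
counts[0, 1, 2] = 10.0
R = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]])
P = sample_mdp(counts)
policy = value_iteration(P, R)
```

Each episode, the agent re-samples a model and re-plans; uncertain state-actions get optimistic draws often enough to drive exploration. TCRL's contribution, per the abstract, is layering a clustering of similar state dynamics on top of such posteriors so that experience generalizes across states without resorting to MCMC.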


Related Papers

Efficient Bayesian Nonparametric Methods for Model-Free Reinforcement Learning in Centralized and Decentralized Sequential Environments

Efficient Bayesian Nonparametric Methods for Model-Free Reinforcement Learning in Centralized and Decentralized Sequential Environments, by Miao Liu, Department of Electrical and Computer Engineering, Duke University.


Efficient Structure Learning in Factored-State MDPs

We consider the problem of reinforcement learning in factored-state MDPs in the setting in which learning is conducted in one long trial with no resets allowed. We show how to extend existing efficient algorithms that learn the conditional probability tables of dynamic Bayesian networks (DBNs) given their structure to the case in which DBN structure is not known in advance. Our method learns th...


Multiagent Planning with Bayesian Nonparametric Asymptotics

Autonomous multiagent systems are beginning to see use in complex, changing environments that cannot be completely specified a priori. In order to be adaptive to these environments and avoid the fragility associated with making too many a priori assumptions, autonomous systems must incorporate some form of learning. However, learning techniques themselves often require structural assumptions to...


Transfer Learning for Reinforcement Learning with Dependent Dirichlet Process and Gaussian Process

The ability to transfer knowledge across tasks is important in guaranteeing the performance of lifelong learning in autonomous agents. We propose a flexible Bayesian Nonparametric (BNP) model based architecture for transferring knowledge between reinforcement learning domains. A Dependent Dirichlet Process Gaussian Process hierarchical BNP model is used to cluster different classes of source MDP...


Probabilistic Reasoning through Genetic Algorithms and Reinforcement Learning

In this paper, we develop an efficient approach for inference over Bayesian networks by using a reinforcement learning controller to direct a genetic algorithm. The random variables of a Bayesian network can be grouped into several sets reflecting the strong probabilistic correlations between random variables in the group. We build a reinforcement learning controller to identify these groups a...



Journal:

Volume   Issue 

Pages  -

Publication year: 2016